Recursive Auto-encoders: An Introduction
I’ve talked a little bit about recursive auto-encoders a couple of posts ago. In the deep learning lingo, an auto-encoder network usually refers to an architecture that takes in an input vector, and through a series of transformations, is trained to reproduce that input in its prediction layer. The reason for doing this is to extract features that describe the input. One might think of it as a form of compression: If the network is asked to be able to reproduce an input with after passing it through hidden layers with a lot less neurons than the input layer, then some sort of compression has to happen in order for it to be able to create a good reconstruction. So let’s consider the above network. 8 inputs, 8 outputs, and 3 in the hidden layer. If we feed the network a one-hot encoding of 1 to 8 (setting only the neuron corresponding to the input to 1), and insist that that input be reconstructed at the output layer, guess what happens?